REMARKS 

With this amendment, claims 14, 17, 20, 22, 39, 42, 45-47, and 58 have been 
amended for clarity. In addition, URLs in the specification have been amended in such a 
manner that they have been deactivated. No new matter has been added by virtue of these 
claim amendments or the amendments to the specification. Upon entry of the present 
amendments, claims 14, 15, 17, 20-22, 39, 40, 42, 45-47, and 58 will remain pending in the 
above-identified application. 

THE OBJECTIONS TO THE SPECIFICATION SHOULD BE WITHDRAWN 

In the July 20, 2005 Office Action, the Examiner objected to embedded hyperlinks 
that appear on pages 12, 13, and 23 of the specification. In response, Applicants have 
amended these hyperlinks in such a manner that they are inactivated. Applicants have 
stripped the "http://" designation from each of the links. In this way, the specification still 
discloses the hyperlink addresses, but does so in a manner that will not accidentally lead to 
active hyperlinks in the specification when the patent application issues and is published on 
the USPTO web site. Accordingly, Applicants request that the objection to the specification 
be withdrawn. 

THE 35 U.S.C. § 112 SECOND PARAGRAPH REJECTION SHOULD BE WITHDRAWN 

The Examiner has rejected claims 14, 15, 17, 20-22, 39, 40, 42, 45-47, and 58 under 
35 U.S.C. § 112, second paragraph, for lack of antecedent basis of the phrase "the correlation 
value associated with the respective genotypic data structures" in independent claims 14, 17, 
20, 22, 39, 42, 45-47, and 58. Claims 15, 21, and 40 are rejected for being dependent from 
rejected claim 14, 20, or 39. In response, Applicants have amended each of these 
independent claims in the same manner. As an example, claim 14 has been amended in 
relevant part as follows: 

repeating said establishing and determining steps for each locus in said 
plurality of loci, thereby establishing a plurality of genotypic data structures 
and, for each respective genotypic data structure in the plurality of genotypic 
data structures, [[an associated]] determining a correlation value; 

identifying one or more genotypic data structures in said plurality of 
genotypic data structures, wherein the correlation value for each respective 
genotypic data structure in the one or more genotypic data structures [[has the 
property that the correlation value associated with the respective genotypic 
data structure forms]] is a high correlation value relative to the correlation 
values of genotypic data structures in said plurality of genotypic data 
structures that are not in said one or more genotypic data structures; wherein 
the loci that correspond to said one or more genotypic data structures represent 
said one or more candidate chromosomal regions that associate with said 
phenotype and wherein an amount of said genome that is included in each 
locus in said plurality of loci is predetermined; 

Each of the independent claims has been amended in this manner. 

With the above-described claim amendments, it is now clear that the repeating step of 
each of the independent claims causes a correlation value to be determined for each genotypic 
data structure in a plurality of data structures. The claimed repeating step is supported by 
Fig. 2 of the specification, in particular steps 206 and 208 coupled with the step 212 of Fig. 2. 
In steps 206 and 208 a genotypic data structure is established and a correlation value is 
determined for the genotypic data structure, as described in the specification, for 
example, on page 23, line 16, through page 26, line 16. Step 212, the repeating step in which 
the establishing and determining steps are repeated for a different genotypic data structure, is 
described, for example, in the specification on page 26, line 21, through page 27, line 15. 
Thus, in each of the independent claims, as amended, a correlation value is determined for 
each genotypic data structure in a plurality of genotypic data structures by operation of the 
repetition of the establishing (Fig. 2, step 206) and determining steps (Fig. 2, step 208) in the 
same manner illustrated in Fig. 2 of the specification. 

The identifying step of each of the independent claims is supported by steps 214 and 
216 of Fig. 2 of the specification. Steps 214 and 216 are described, for example, on page 27, 
line 26, through page 27, line 17, of the specification. Here, consistent with the independent 
claims as amended, the specification teaches that genotypic data structures that have high 
correlation values are selected. 

The Examiner has further rejected claims 14, 15, 17, 20-22, 39, 40, 42, 45-47, and 58 
under 35 U.S.C. § 112, second paragraph, for lack of antecedent basis for reciting the phrase 
"the property" in independent claims 14, 17, 20, 22, 39, 42, 45-47, and 58. Claims 15, 21, 
and 40 are rejected for being dependent from rejected claim 14, 20, or 39. In response, 
Applicants have amended these independent claims to cancel the phrase "the property" and to 
amend the phrase "the associated correlation value" in favor of "determining a correlation 
value." 

In view of the above-identified claim amendments, Applicants believe that each of the 
35 U.S.C. § 112, second paragraph, rejections raised by the Examiner has now been 
obviated. Accordingly, Applicants respectfully request that the 35 U.S.C. § 112, second 
paragraph, rejection of the pending claims be withdrawn. 



THE 35 U.S.C. § 112 FIRST PARAGRAPH REJECTION SHOULD BE WITHDRAWN 

The Examiner has rejected claims 14, 15, 17, 20-22, 39, 40, 42, 45-47, and 58 under 
35 U.S.C. § 112, first paragraph, for allegedly failing to comply with the written description 
requirement. In particular, the Examiner contends that the claim limitation "the one or more 
genotypic data structures has the property that the correlation value associated with the 
respective genotypic data structure" is not found in the instant specification. Applicants have 
amended this claim limitation in the manner described above in the section addressing the 35 
U.S.C. § 112, second paragraph, rejections. Specifically, the following claim amendment has 
been made: 

wherein the correlation value for each respective genotypic data structure in 
the one or more genotypic data structures [[has the property that the correlation 
value associated with the respective genotypic data structure forms]] is a high 
correlation value relative to the correlation values of genotypic data structures 
in said plurality of genotypic data structures that are not in said one or more 
genotypic data structures 

This claim limitation, as amended, is supported by steps 214 and 216 of Fig. 2 of the instant 
specification. Steps 214 and 216 are described, for example, on page 27, line 26, through 
page 27, line 17, of the specification. Here, consistent with the independent claims as 
amended, the specification teaches that genotypic data structures that have high correlation 
values are selected. Additional support for this claim limitation is found on page 5, lines 1-10 
of the specification reproduced for the Examiner's convenience: 

The phenotypic and genotypic data structures are then compared to form a 
correlation value. The process continues with the establishment of another 
genotypic data structure that corresponds to a different loci and the 
concomitant comparison of this genotypic data structure to the phenotypic 
structure until several of the loci in the genome of the organism have been 
tested in this manner. In this way, one or more genotypic data structures are 
identified that form a high correlation value relative to all other genotypic data 
structures that have been compared to the phenotypic data structure. Further, 
the loci in the genome of the organism that correspond to the highly correlated 
genotypic data structures represent one or more candidate chromosomal 
regions that may be associated with the phenotype of interest. 

This passage illustrates how one or more genotypic data structures are identified 
that each form a high correlation value relative to all other genotypic data structures, 
consistent with the limitation identified by the Examiner. Thus, Applicants contend 
that the claim limitation identified by the Examiner is fully supported by the 
specification in its amended form. 

The Examiner has rejected claims 14, 15, 17, 20-22, 39, 40, 42, 45-47, and 58 under 
35 U.S.C. § 112, first paragraph, on the additional basis that the claim limitation "plurality of 
genotype data structures that are not in said one or more genotypic data structures" is not 
found in the instant specification. Applicants have not amended this claim limitation in the 
instant response because Applicants respectfully disagree with the Examiner on this point. 
A plurality of genotypic data structures are established in the instant application and in each 
of the independent claims. This plurality of genotypic data structures is then divided into two 
classes: (i) the identified one or more genotypic data structures that have high correlation 
values and (ii) those genotypic data structures in the plurality of genotypic data structures that 
are not in the identified one or more genotypic data structures (i.e., that do not have high 
correlation values). 

Applicants have noted several passages in the specification where it is clear that one 
or more genotypic data structures are selected from a plurality of genotypic data structures. 
For example, on page 5, lines 1-10, of the specification, reproduced above, "one or more 
genotypic data structures are identified that form a high correlation value relative to all other 
genotypic data structures that have been compared to the phenotypic data structure." This 
claim limitation is further supported by step 216 of Fig. 2. Page 28, lines 8-9, of the 
specification states that, in processing step 216, the genotypic data structures that achieve the 
highest correlation values are selected. The specification makes it clear that the 
correlation values of the selected genotypic data structures are high relative to those of the 
genotypic data structures not selected. For example, on page 28, lines 12-14, the 
specification states that "[i]n one embodiment, the selection process in processing step 216 is 
performed by selecting genotypic data structures that form a correlation value that is a 
predetermined number of standard deviations above the mean correlation value." 
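
Purely by way of illustration, the selection embodiment quoted above can be sketched in Python. The function name, the two-standard-deviation threshold, the use of the population standard deviation, and the correlation values below are hypothetical choices made for this sketch; none of them is taken from the specification.

```python
from statistics import mean, pstdev


def select_high_correlation(correlations, n_std=2.0):
    """Return the loci whose correlation value exceeds mean + n_std * SD.

    `correlations` maps a locus label to its correlation value.
    `n_std` stands in for the "predetermined number of standard deviations";
    the default of 2.0 is an assumption made only for this sketch.
    """
    values = list(correlations.values())
    cutoff = mean(values) + n_std * pstdev(values)
    return {locus for locus, value in correlations.items() if value > cutoff}


# Hypothetical correlation values for six loci (illustrative only).
correlations = {
    "locus_1": 0.10,
    "locus_2": 0.12,
    "locus_3": 0.08,
    "locus_4": 0.11,
    "locus_5": 0.09,
    "locus_6": 0.95,
}
selected = select_high_correlation(correlations, n_std=2.0)
not_selected = set(correlations) - selected
```

In this sketch the plurality of genotypic data structures is divided into the two classes discussed above: those selected for their high correlation values and those that are not selected.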

In view of the above-identified claim amendments, Applicants believe that each of the 
35 U.S.C. § 112, first paragraph, rejections raised by the Examiner has now been obviated. 
Accordingly, Applicants respectfully request that the 35 U.S.C. § 112, first paragraph, 
rejection of the pending claims be withdrawn. 

THE 35 U.S.C. § 101 REJECTION SHOULD BE WITHDRAWN 

The Examiner has rejected claims 14, 15, 17, 20-22, 39, 40, 42, 45-47, and 58 under 
35 U.S.C. § 101 because the claimed invention is allegedly directed to non-statutory 
algorithm type subject matter. For the reasons discussed below, Applicants respectfully 
traverse the rejection. 

The rejection of the claims under 35 U.S.C. § 101 in the July 20, 2005 Office Action 
represents the second time the claims have been rejected on the very same basis. The first 
time the claims were rejected on this basis was in the October 19, 2004 Office Action, where, 
on page 3, the Examiner stated that the claims were rejected because they are "directed to a 
method comprising algorithmic steps for manipulating genotypic and phenotypic data without 
any physical alteration step, which is considered to be non-statutory subject matter." This per 
se data transformation test of Applicants' claims is in complete contradiction to the 
instructions provided by John J. Doll, Commissioner for Patents, in the October 26, 2005, 
Interim Guidelines for Examination of Patent Applications for Patent Subject Matter 
Eligibility (hereinafter "Interim Guidelines").¹ Page 20 of the Interim Guidelines explains: 

b. Practical Application That Produces a Useful, Concrete, and Tangible 
Result 

For eligibility analysis, physical transformation "is not an invariable 
requirement, but merely one example of how a mathematical algorithm [or law 
of nature] may bring about a useful application." AT&T, 172 F.3d at 1358-59, 
50 USPQ2d at 1452. If the examiner determines that the claim does not entail 
the transformation of an article, then the examiner shall review the claim to 
determine if the claim provides a practical application that produces a useful, 
tangible and concrete result. 

¹ A copy of the Interim Guidelines is attached as Exhibit A. Specifically, page 42 of 
the Interim Guidelines states that a per se data transformation test is not to be applied by 
Examiners in determining whether the claimed invention is patent eligible subject matter. 

Applicants respectfully submit that the rejected claims each provides a useful, concrete, and 
tangible result. 

Applicants' claims provide a useful result. According to page 20 of the Interim 
Guidelines, a claim provides a useful result when it is: (i) specific, (ii) substantial, and (iii) 
credible. Applicants' claims are for specific methods, computer program products, and 
computer systems that associate a phenotype with one or more chromosomal regions (termed 
"candidate chromosomal regions") in a genome of a species. This utility is specifically 
claimed in each of the independent claims. For example, the preamble of claim 14 begins: 

A method of associating a phenotype with one or more candidate 
chromosomal regions 

and, in the claim body, positively recites the limitation: 

wherein the loci that correspond to said one or more genotypic data structures 
represent said one or more candidate chromosomal regions that associate with 
said phenotype. 

Thus, through the claimed identification of genotypic data structures that have high 
correlation scores, loci that correspond to such genotypic data structures are identified as the 
one or more candidate chromosomal regions that associate with a phenotype. The association 
of a phenotype with one or more candidate chromosomal regions is a substantial utility 
because it helps to identify the genes that are responsible for that particular phenotype, and is 
credible. This is evidenced by the reception of Applicants' work in the scientific community. 
For example, Applicants' claimed invention is reported in Grupe et al., 2001, Science 
292:1915-1918,² the abstract of which states: 

Experimental murine genetic models of complex human disease show great 
potential for understanding human disease pathogenesis. To reduce the time 
required for analysis of such models from many months down to milliseconds, 
a computational method for predicting chromosomal regions regulating 
phenotypic traits and a murine database of single nucleotide polymorphisms 
were developed. After entry of phenotypic information obtained from inbred 
mouse strains, the phenotypic and genotypic information is analyzed in silico 
to predict the chromosomal regions regulating the phenotypic trait. 

² Grupe et al., 2001, Science 292:1915-1918 is enclosed as Exhibit B. 

As the Science abstract explains, using Applicants' claimed invention, genetic models can be 
rapidly analyzed. Thus, Applicants' claims provide a useful result because they are directed 
to an invention that is (i) specific, (ii) substantial, and (iii) credible. 

Applicants' claims provide a concrete result. According to page 22 of the Interim 
Guidelines, the question of whether a claim provides a "concrete" result arises when a result cannot 
be assured. Thus, the requirement for providing a concrete result is not a physical alteration 
requirement. Rather, as noted on page 22 of the Interim Guidelines, the test for concreteness 
is satisfied when the result is substantially repeatable. As further noted in the Interim 
Guidelines, the opposite of "concrete" is unrepeatable or unpredictable. Applicants' claims 
provide highly repeatable results, as evidenced by the experimental results in Applicants' 
specification and the fact that representative work in accordance with Applicants' claims is 
published in the reputable peer-reviewed journal Science. 

Applicants' claims provide a tangible result. According to current USPTO policy, 
the "tangible result" requirement is not a physical alteration requirement: "[t]he tangible 
requirement does not necessarily mean that a claim must either be tied to a particular machine 
or apparatus or must operate to change articles or materials to a different state or thing." 
(Interim Guidelines, p. 21.) All that is required is to set forth a practical application. As 
discussed above, Applicants have satisfied this requirement. Applicants have identified 
methods, computer program products, and computer systems that have the practical 
application of associating a phenotype with one or more chromosomal regions (termed 
"candidate chromosomal regions") in a genome of a species. The claimed invention can 
reduce the time required for analysis of genetic models from many months down to 
milliseconds. 

Summary. As the above discussion explains, Applicants' claims are directed to a 
practical application that produces a useful, tangible, and concrete result. Accordingly, 
Applicants respectfully request that the 35 U.S.C. § 101 rejection of claims 14, 15, 17, 20-22, 
39, 40, 42, 45-47, and 58 be withdrawn. 
CONCLUSION 



In view of the above remarks, Applicants respectfully submit that the subject 
application is in good and proper order for allowance. Withdrawal of the Examiner's 
rejections and objections and early notification to this effect are earnestly solicited. 

No fee is believed owed in connection with filing of this amendment and response. 
However, should the Commissioner determine otherwise, the Commissioner is authorized to 
charge any underpayment or credit any overpayment to Jones Day Deposit Account No. 50-3013 
for the appropriate amount. A copy of this sheet is attached. 



Respectfully submitted, 

Date: December 19, 2005 

Brett Lovejoy 
42,813 
(Reg. No.) 

JONES DAY 
222 East 41st Street 
New York, New York 10017-6702 
(415) 875-5744 
EXHIBIT A 

OCTOBER 26, 2005, INTERIM GUIDELINES FOR EXAMINATION OF PATENT 
APPLICATIONS FOR PATENT SUBJECT MATTER ELIGIBILITY 

EXHIBIT B 

GRUPE ET AL., 2001, SCIENCE 292:1915-1918 
Identification of genetic susceptibility loci has promised insight into pathophysiologic 
mechanisms and the development of therapies for common human diseases. Analysis of 
experimental murine genetic models of human disease biology should greatly facilitate 
identification of genetic susceptibility loci for common human diseases. We present a 
computational method that markedly accelerates genetic analysis of murine disease models. 
A linkage prediction program scans a murine single nucleotide polymorphism (SNP) database 
and, only on the basis of known inbred strain phenotypes and genotypes, predicts the 
chromosomal regions that most likely contribute to complex traits. The computational 
prediction method does not require generation and analysis of experimental intercross 
progeny, but it correctly predicted the chromosomal regions identified by analysis of 
experimental intercross populations for multiple traits analyzed. 

A Web-accessible database was developed, which contains allele information across 15 
inbred strains and specifies genotyping assays for over 500 SNPs at defined locations in the 
mouse genome (http://mouseSNP.roche.com). These SNPs were identified in our laboratories 
by direct sequencing of polymerase chain reaction (PCR) amplification products from defined 
chromosomal locations. This database also incorporates published allele information for 2848 
SNPs, 45% of which are characterized in a subset of Mus musculus strains; 55% of the SNPs 
are polymorphic between Mus castaneus and one or more M. musculus subspecies (1). User 
queries regarding SNPs found within a specified chromosomal region or between selected 
inbred strains are executed in real time and provided through a graphical user interface. The 
oligonucleotide primer sequences and conditions for performing allele-specific kinetic PCR 
genotyping assays (2) are also provided in the mSNP database [see supplemental material 
(3)]. 

To demonstrate the utility of this information, the genome of pooled DNA samples obtained 
from intercross progeny was analyzed by two different genotyping methods. At 16 weeks of 
age, the 1000 F2 progeny of a C57BL/6 X B6D2 intercross display a non-sex linked, normal 
distribution of bone mineral density (BMD) (4). Phenotypically extreme F2 progeny with the 
highest (n = 150 mice) and lowest (n = 149 mice) BMD (top and bottom 15%, respectively), 
were subjected to a whole-genome scan for association with BMD by genotyping individual 
DNA samples with 112 microsatellite markers. In addition, equal amounts of DNA from the 
high and low BMD F2 progeny was used to form two pools of DNA samples. Allele 
frequencies in the pooled samples were measured for 109 SNPs found in the mSNP database 
with the use of the previously described allele-specific kinetic PCR method (2). Differences 
in allele frequency between the two extremes for each marker were scored. If a marker has 
no association with BMD, its expected frequency is 50% for both extremes. The significance 
of each allele-frequency difference was calculated using the z-test and plotted as a lod score 
(a logarithm of the odds ratio for linkage) (Fig. 1). A significant association (lod score > 3.3) 
was found for four regions on chromosomes 1, 2, 4, and 11 by the microsatellite and SNP 
genotyping methods. SNP-based genotyping identified a linkage region near the centromere 
of chromosome 13, which was not found using microsatellite markers. Two SNP markers (2.2 
and 6.6 cM) were more proximal to the centromere of chromosome 13 than the most proximal 
(10 cM) microsatellite marker used for genotyping the intercross progeny. This region is 
being investigated with additional markers. 
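
The scoring step described above (a z-test on the allele-frequency difference, reported as a lod score) can be sketched as follows. The article does not give its exact formulas, so this sketch assumes a pooled two-proportion z-test over allele counts (two alleles per mouse) and the common conversion lod = z^2 / (2 ln 10); the function names and the example frequencies are hypothetical.

```python
import math


def pooled_z(freq_high, freq_low, alleles_high, alleles_low):
    """Two-proportion z statistic for an allele-frequency difference.

    Assumption: a standard pooled-variance z-test, with sample sizes
    counted in alleles rather than in mice.
    """
    pooled = (freq_high * alleles_high + freq_low * alleles_low) / (
        alleles_high + alleles_low
    )
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / alleles_high + 1.0 / alleles_low))
    if se == 0.0:
        return 0.0
    return (freq_high - freq_low) / se


def z_to_lod(z):
    """Convert a z statistic to a lod score, assuming lod = z^2 / (2 ln 10)."""
    return z * z / (2.0 * math.log(10.0))


# Hypothetical marker: allele frequency 0.65 in the high-BMD pool and 0.35 in
# the low-BMD pool, with 150 mice (300 alleles) per pool.
z = pooled_z(0.65, 0.35, 300, 300)
lod = z_to_lod(z)
is_significant = lod > 3.3  # genome-wide threshold quoted in the text
```

A marker with no association (50% in both pools) gives z = 0 and therefore a lod score of 0 under these assumptions.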

SNP-based genotyping of pooled samples required about 20-fold fewer PCR reactions and was 
performed much more quickly than microsatellite genotyping of individual DNA samples. 
Replicate determinations (four times) were performed here to assess the reproducibility of 
the SNP-allele frequency determination and measurement error. On average, the standard 
deviation in allele frequency measurement was ± 1.7%. In the future, it should be possible to 
reduce the number of replicate PCR assays. 

We wanted to determine whether chromosomal regions regulating quantitative traits (QTL 
intervals) could be computationally predicted with the use of the mSNP database and 
available phenotypic information on inbred strains. Using the allelic distributions across 
inbred strains contained in the mSNP database, the computational method calculates 
genotypic distances between loci for a pair of mouse strains. These genotypic distances are 
then compared with phenotypic differences between the two mouse strains. The process is 
repeated for all mouse strain pairs for which phenotypic information is available. Lastly, a 
correlation value is derived using linear regression on the phenotypic and genotypic 
distances for each genomic locus. 
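
A minimal sketch of the procedure just described is given below. It is not the authors' program: the genotypic distance is assumed here to be the number of SNPs in a locus at which two strains carry different alleles, the phenotypic distance is assumed to be the absolute difference in trait value, and the correlation value is taken as the Pearson coefficient of the simple linear regression between the two sets of distances. The strain names, allele strings, and trait values are invented for illustration.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def locus_correlations(phenotypes, genotypes):
    """Correlate genotypic and phenotypic distances over all strain pairs.

    phenotypes: {strain: trait value}
    genotypes:  {locus: {strain: string with one allele character per SNP}}
    Returns {locus: correlation value}.
    """
    pairs = list(combinations(sorted(phenotypes), 2))
    pheno_dist = [abs(phenotypes[a] - phenotypes[b]) for a, b in pairs]
    result = {}
    for locus, alleles in genotypes.items():
        # Assumed genotypic distance: count of SNPs with differing alleles.
        geno_dist = [
            sum(x != y for x, y in zip(alleles[a], alleles[b])) for a, b in pairs
        ]
        result[locus] = pearson(geno_dist, pheno_dist)
    return result


# Hypothetical data for four strains and two loci.
phenotypes = {"A": 1.0, "B": 1.1, "C": 3.0, "D": 3.2}
genotypes = {
    "locus_1": {"A": "AAG", "B": "AAG", "C": "GGT", "D": "GGT"},
    "locus_2": {"A": "AC", "B": "GC", "C": "AC", "D": "GC"},
}
r = locus_correlations(phenotypes, genotypes)
```

In this toy data set the alleles at locus_1 split the strains the same way the trait does, so its correlation value is high, whereas locus_2 does not track the trait.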

As a first example, we used the computational method to predict the chromosomal location 
of the major histocompatibility complex (MHC) complex, which has been mapped to murine 
chromosome 17, using the known H2 haplotypes for the MHC K locus for 10 inbred strains 
(5). Phenotypic distances for strains that shared a haplotype were set to zero, and a distance 
of one was used for strains of different haplotypes. The SNPs within and near the MHC 
region had a genotypic distribution that was highly correlated with the phenotypic distances; 
the correlation value for this interval was 5.3 standard deviations above the average for all 
loci analyzed. No other peaks in the mouse genome exhibited a comparable correlation with 
this phenotype (Fig. 2A). This computational analysis, which required less than 1 s to run on 
a standard desktop computer, excluded 98% of the mouse genome from consideration without 
missing the genomic region known to contain the MHC. 

In addition to the MHC locus, we tested the computational method using nine quantitative 
traits known from published studies that provided mapped QTL intervals and phenotypic data 
across multiple inbred strains for each trait (Table 1) (3). The ability of this algorithm to 
identify chromosomal regions regulating susceptibility to experimental allergic asthma was 
investigated. Analysis of intercross progeny between susceptible (A/J) and resistant 
(C3H/HeJ) mouse strains identified a QTL interval on chromosome 2 and a suggested interval 
on chromosome 7 (6). Analysis of a different experimental intercross identified QTL 
intervals on chromosomes 10 and 11 (7). Phenotypic measurements for allergen-induced 
airway hyperresponsiveness (AHR) in four inbred strains was used for a computational 
genome scan. The experimentally identified QTL intervals on chromosome 2, 7, 10, and 11 
were among the strongest peaks identified by the computational genome scan (Fig. 2B). The 
computational method excluded 85% of the mouse genome from consideration without 
missing the experimentally mapped QTL regions. 

Fig. 1. Comparison of SNP-based genotyping of pooled DNA samples with microsatellite 
genotyping of individual DNA samples. Phenotypically extreme F2 progeny from a B6D2 
intercross with the highest and lowest BMD were subjected to whole-genome scanning for 
association with BMD by genotyping either individual DNA samples (from 299 mice) with 
112 microsatellite markers or two pooled DNA samples (150 mice per pool) with 109 SNP 
markers. The significance of each allele-frequency difference was calculated using the z-test 
and plotted as a lod score for all chromosomes. Dashed line indicates a lod score of 3.3, the 
threshold for genome-wide significance. 

Table 1. Comparison between experimentally identified QTL intervals with computationally 
predicted chromosomal regions for 10 phenotypic traits. The experimentally identified QTL 
intervals and phenotypic information used for computational prediction are described in the 
references indicated and are summarized in supplementary tables 1 and 2 (3). PKC, protein 
kinase C; Exp., total number of experimentally verified QTL intervals; Correct, number of 
computationally predicted regions that overlap with the experimentally verified locus; 
Predicted, total number of predicted regions for each phenotype; Cutoff, percentage of the 
mouse genome included within the computationally predicted regions. 

Phenotype             Reference   Exp.   Correct   Predicted   Cutoff (%) 
AHR                   (6, 7)             4         6           15 
Alcohol preference    (12-16)            3         6           16 
Alcohol withdrawal    (17)               2         5           10 
BMD                   (4, 18)            2         2           4 
Eye weight            (19)               1         4           10 
Ganglion cell count   (20)               1         2           4 
Lymphoma              (21, 22)           3         4           8 
MHC                   (5)                1         1 
PKC activity          (23)               1         2 
PKC content           (23)               1         6           14 
Total                             26     19        38 

The ability of the computational method to correctly predict chromosomal regions containing 
experimentally verified QTL intervals was evaluated using 10 phenotypic traits (Table 1) (3). 
The percentage of correct predictions was characterized as a function of the percentage of 
the mouse genome contained within the predicted chromosomal regions. If predicted regions 
contained 10% of the mouse genome (by selecting 10% of the peaks with the highest 
correlation), then 15 of the 26 experimentally verified QTL intervals were correctly 
identified. As the threshold was raised, limiting the number of predicted candidate regions, 
more experimentally verified QTL intervals were missed. In summary, at cutoff values 
ranging from 2 to 16%, 19 of 26 experimentally verified QTL intervals regulating 10 
phenotypic traits were correctly identified (Table 1). 

We applied a Fisher Exact test to assess the significance of the computational predictions. 
The average size of a predicted genomic region was 38 cM, segmenting the 1500-cM mouse 
genome into 40 regions. Therefore, a total of 400 genomic intervals were analyzed for the 10 
quantitative traits examined. At a 10% genome-wide threshold, the computational method 
correctly identified 15 (true positive) and missed 11 (false negative) of the 26 experimentally 
verified QTL intervals. The algorithm further predicted that 24 genomic intervals (false 
positive) contributed to a phenotypic trait where no QTL had yet been experimentally 
characterized, and the predictions agreed with available experimental data that 350 regions 
(true negative) were not QTL intervals for the 10 phenotypes examined. The Fisher Exact 
test yields a highly significant P value (1.0 × 10^-10), confirming significant agreement 
between the computationally predicted and experimentally determined chromosomal regions. 
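
The counts given above form a 2 × 2 table (15 true positives, 11 false negatives, 24 false positives, 350 true negatives; 400 intervals in total), and the test can be re-computed from them. The sketch below is an illustrative pure-Python implementation of a two-sided Fisher exact test that sums the hypergeometric probabilities no larger than that of the observed table; the article does not say which variant or software was used, so the function name and the two-sided convention are assumptions.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the 2 x 2 table [[a, b], [c, d]].

    Sums the probabilities of all tables with the same margins that are
    no more likely than the observed table (one common convention).
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    tolerance = 1.0 + 1e-9  # guards against floating-point ties
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * tolerance
    )


# Rows: experimentally verified QTL interval (yes / no).
# Columns: computationally predicted (yes / no).
p_value = fisher_exact_two_sided(15, 11, 24, 350)
```

Under these assumptions the resulting P value should fall on the same order of magnitude as the value reported in the text.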

Computational analysis of the murine SNP database using phenotypic data from inbred 
parental strains rapidly identifies candidate QTL intervals. This can eliminate many months 
to years of laboratory work required to generate, characterize, and genotype intercross 
progeny, reducing the time required for QTL interval identification to milliseconds. In 
addition to its rapidity and low cost, the computational prediction method has a substantial 
advantage over QTL analysis using intercross progeny or recombinant inbred strains (8). 
Because it performs multiple comparisons across a range of inbred strains, the computational 
method takes advantage of the total genetic variation provided by available inbred mouse 
strains. 

The ability of the computational genome 
scan to perform whole-genome association 
studies using the mouse SNP database indi- 
cates that linkage disequilibrium may extend 
over large regions among inbred mouse 
strains. Our computational results were 
unexpected because the number of different inbred 
strains for which phenotypic data was available 
(4 to 10) was quite limited. Positional 
cloning and case-control studies in human 
populations are routinely performed with 
hundreds to thousands of individuals (9). 
Several factors contribute to the successful 
QTL predictions by computational scanning 
of the mouse SNP database. The use of inbred 
mouse strains limits variability due to envi- 
ronment, and timed experimental interven- 
tion and sampling limits error in phenotypic 
assessment. The inbred strains are homozygous 
at all loci, which eliminates confounding 
effects due to heterozygosity found in 
human populations. 

Fig. 2. Computational prediction of chromosomal regions regulating (A) MHC haplotype and (B) 
airway hyperresponsiveness. The correlation between the genotypic and phenotypic distributions 
is graphically shown for each trait; segments are arranged from centromeric to telomeric for all 19 
autosomes. Each bar represents a 30-cM interval, and neighboring bars are offset by 10 cM. The 
dotted line represents a useful cutoff for analyzing this data; the most highly correlated 10% of the 
loci are above this line. Striped bars represent locations of experimentally verified QTLs. 

Recently, there has been increased emphasis 
on using chemical mutagenesis in the 
mouse as a method for studying complex 
biology. This has occurred as a result of the 
difficulties noted by investigators using stan- 
dard methods for QTL analysis [reviewed in 
(10)]. However, these studies can be markedly 
accelerated by application of the genotyping 
method and computational tools described 
here. Of course, specific gene candidates 
must be identified to understand the 
genetic basis of complex disease. We have 
already shown how integration of gene ex- 
pression data obtained with high-density oli- 
gonucleotide microarrays can be used in con- 
junction with the SNP genotyping method to 
accelerate QTL analysis (11). Therefore, da- 
tabases with tissue-specific gene expression 
and phenotypic information across mouse 
strains could be used in conjunction with the 
murine SNP database to computationally 
identify candidate disease genes. In a hypo- 
thetical experiment, the expression of 40,000 
murine genes in an affected tissue obtained 
from different mouse strains can be profiled. 
As many as 1% of the genes will be reliably 
demonstrated to be differentially expressed in 
the tissue of the mouse strains with a different 
phenotype. The resulting list of 400 gene 
candidates could be computationally reduced 
by 90% by searching for genes that are encoded 
within computationally predicted chromosomal 
regions, providing a reasonable 
starting point for analysis of complex disease 
biology. The application of this approach 
should reduce the frustrations and overcome 
the difficulties associated with QTL analysis 
in murine complex disease models. 
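
The filtering step in this hypothetical experiment can be sketched as follows. The record types, gene names, and map positions are invented for illustration; the text does not describe a data format. The sketch simply keeps differentially expressed genes whose map position falls inside a computationally predicted interval.

```python
from typing import NamedTuple

class Interval(NamedTuple):
    chrom: int
    start_cm: float
    end_cm: float

class Gene(NamedTuple):
    name: str
    chrom: int
    pos_cm: float

def genes_in_intervals(genes, intervals):
    """Keep genes located inside any predicted chromosomal interval."""
    return [
        g for g in genes
        if any(iv.chrom == g.chrom and iv.start_cm <= g.pos_cm <= iv.end_cm
               for iv in intervals)
    ]

# Hypothetical predicted intervals and differentially expressed genes.
predicted = [Interval(1, 40.0, 70.0), Interval(17, 10.0, 40.0)]
candidates = [
    Gene("geneA", 1, 55.0),
    Gene("geneB", 1, 90.0),
    Gene("geneC", 17, 18.5),
    Gene("geneD", 5, 30.0),
]
kept = genes_in_intervals(candidates, predicted)
print([g.name for g in kept])

# Arithmetic from the text: 1% of 40,000 profiled genes, then a 90% reduction.
profiled = 40_000
differential = profiled // 100           # 400 candidate genes
remaining = differential - differential * 90 // 100   # 40 genes left to study
print(differential, remaining)
```

Under the figures given in the text, this would leave roughly 40 genes as a starting point for follow-up.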

References and Notes 

1. K. Lindblad-Toh et al., Nature Genet. 24, 381 (2000). 

2. S. Germer, Genome Res. 10, 258 (2000). 

3. Web figure 1, Web tables 1 and 2, and supplemental 
text are available at Science Online at 
www.sciencemag.org/cgi/content/full/292/5523/1915/DC1. 

4. R. F. Klein et al., J. Bone Miner. Res., in press. 

5. JAX Notes 475, (1998). 

6. S. Ewart et al., Am. J. Respir. Cell Mol. Biol. 23, 537 
(2000). 

7. Y. Zhang et al., Hum. Mol. Genet. 8, 601 (1999). 

8. A. Darvasi, Nature Genet. 18, 19 (1998). 

9. N. Risch, K. Merikangas, Science 273, 1516 (1996). 

10. J. H. Nadeau, W. N. Frankel, Nature Genet. 25, 381 
(2000). 

11. C. L. Karp et al., Nature Immunol. 1, 221 (2000). 

12. J. K. Belknap, S. P. Richards, L. A. O'Toole, M. L. Helms, 
T. J. Phillips, Behav. Genet. 27, 55 (1997). 

13. J. A. Melo, J. Shendure, K. Pociask, L. M. Silver, Nature 
Genet. 13, 147 (1996). 

14. T. J. Phillips, J. C. Crabbe, P. Metten, J. K. Belknap, 
Alcohol Clin. Exp. Res. 18, 931 (1994). 

15. T. J. Phillips, J. K. Belknap, K. J. Buck, C. L. Cunningham, 
Mamm. Genome 9, 936 (1998). 

16. L. M. Tarantino, G. E. McClearn, L. A. Rodriguez, R. 
Plomin, Alcohol Clin. Exp. Res. 22, 1099 (1998). 

17. K. J. Buck, P. Metten, J. K. Belknap, J. C. Crabbe, 
J. Neurosci. 17, 3946 (1997). 

18. W. G. Beamer et al., Mamm. Genome 10, 1043 
(1999). 

19. G. Zhou, R. W. Williams, Investig. Ophthalmol. Vis. 
Sci. 40, 817 (1999). 

20. R. W. Williams, R. C. Strom, D. Goldowitz, J. Neurosci. 
18, 138 (1998). 

21. M. L. Mucenski, B. A. Taylor, N. A. Jenkins, N. G. 
Copeland, Mol. Cell. Biol. 6, 4236 (1986). 

22. A. Wielowieyski, L. A. Brennan, J. Jongstra, Mamm. 
Genome 10, 623 (1999). 

23. L. D. Dwyer-Nield, B. Paigen, S. E. Porter, A. M. 
Malkinson, Am. J. Physiol. 279, L326 (2000). 

24. This work was partially supported by an NIH Genome 
Research Institute grant (1 R01 HG02322-01) to G.P. 
J.U. was partially supported by the Stanford Univer- 
sity Genome Training Program (NIH T32HG-00044). 
We thank H. C. Andersen, D. Beier, D. Birch, C. 
Carlson, H. Erlich, U. Germer, M. Holland, and J. 
Sninsky for their help. 

9 January 2001; accepted 7 May 2001 






1918 



8 JUNE 2001 VOL 292 SCIENCE www.sciencemag.org 



